Deep Layer Aggregation
Abstract
Visual recognition requires rich representations that span levels from low to high, scales from small to large, and resolutions from fine to coarse. Even with the depth of features in a convolutional network, a layer in isolation is not enough: compounding and aggregating these representations improves inference of what and where. Architectural efforts are exploring many dimensions for network backbones, designing deeper or wider architectures, but how to best aggregate layers and blocks across a network deserves further attention. Although skip connections have been incorporated to combine layers, these connections have been “shallow” themselves, and only fuse by simple, one-step operations. We augment standard architectures with deeper aggregation to better fuse information across layers. Our deep layer aggregation structures iteratively and hierarchically merge the feature hierarchy to make networks with better accuracy and fewer parameters. Experiments across architectures and tasks show that deep layer aggregation improves recognition and resolution compared to existing branching and merging schemes.
Representation learning and transfer learning now permeate computer vision as engines of recognition. The simple fundamentals of compositionality and differentiability give rise to an astonishing variety of deep architectures [9, 17, 15, 6, 22]. The rise of convolutional networks as the backbone of many visual tasks, ready for different purposes with the right task extensions and data [4, 14, 19], has made architecture search a central driver in sustaining progress. The ever-increasing size and scope of networks now directs effort into devising design patterns of modules and connectivity patterns that can be assembled systematically. This has yielded networks that are deeper and wider, but what about more closely connected?
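The contrast the abstract draws between "shallow" one-step fusion and iterative merging can be illustrated with a small structural sketch. This is not the paper's implementation: the `aggregate` node, the stage names, and the hop-counting representation are all hypothetical stand-ins chosen only to show how many aggregation nodes each layer's features pass through under each scheme. It assumes iterative merging starts at the shallowest stage and folds in deeper stages one at a time.

```python
from functools import reduce


def aggregate(*features):
    """Hypothetical aggregation node (stand-in for a learned fusion layer).

    Each feature is a dict mapping a layer name to the number of
    aggregation nodes that layer's signal has passed through so far.
    Merging adds one hop to every layer that enters the node.
    """
    merged = {}
    for feature in features:
        for layer, hops in feature.items():
            merged[layer] = hops + 1
    return merged


def shallow_fusion(features):
    """One-step fusion: every layer enters a single aggregation node."""
    return aggregate(*features)


def iterative_fusion(features):
    """Iterative fusion (assumed order: shallowest first).

    Computes aggregate(aggregate(aggregate(f1, f2), f3), f4), so earlier
    layers are refined by more aggregation nodes than later ones.
    """
    return reduce(aggregate, features)


# Hypothetical four-stage backbone; names are illustrative only.
layers = [{"stage1": 0}, {"stage2": 0}, {"stage3": 0}, {"stage4": 0}]

print(shallow_fusion(layers))    # every stage passes through 1 node
print(iterative_fusion(layers))  # stage1 and stage2 pass through 3 nodes
```

In this toy model the one-step scheme gives every stage exactly one aggregation hop, while the iterative scheme sends the shallowest stages through three, which is one way to read the claim that aggregation itself can be made deeper.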
More nonlinearity, greater capacity, and larger receptive fields generally improve accuracy but can be problematic for optimization and computation. To overcome these barriers, different blocks or modules have been incorporated to balance and temper these quantities, such as bottlenecks for dimensionality reduction [11, 17, 7] or residual, gated, and concatenative connections for feature and gradient propagation [7, 16, 8]. Networks designed according to these schemes have 100+ and even 1000+ layers.
Figure 1 (panels: Dense Connections, Feature Pyramids, Deep Layer Aggregation): Deep layer aggregation unifies semantic and spatial fusion to better capture what and where. Our aggregation architectures encompass and extend densely connected networks and feature pyramid networks with hierarchical and iterative skip connections that deepen the representation and refine resolution.
Nevertheless, further exploration is needed on how to connect these layers and modules. Layered networks from LeNet [10] through AlexNet [9] to ResNet [7] stack layers and modules in sequence. Layerwise accuracy comparisons [3, 23, 14], transferability analysis [20], and representation visualization [23, 21] show that deeper layers extract more semantic and more global features, but these signs do not prove that the last layer is the ultimate representation for any task. In fact, skip connections have proven effective for classification and regression [8, 1] and more structured tasks [5, 14, 12].
Aggregation, like depth and width, is a critical dimension of architecture. In this work, we investigate how to aggregate layers to better fuse semantic and spatial information for recognition and localization. Extending the “shallow” skip connections of current approaches, our aggregation architectures incorporate more depth and sharing. We introduce two structures for deep layer aggregation (DLA): iterative deep aggregation (IDA) and hierarchical deep aggregation (HDA). These
Related resources
Iterative Deep Aggregation Hierarchical Deep Aggregation
Architectural efforts are exploring many dimensions for network backbones, designing deeper or wider architectures, but how to best aggregate layers and blocks across a network deserves further attention. We augment standard architectures with deeper aggregation to better fuse information across layers. Our deep layer aggregation structures iteratively and hierarchically merge the feature hiera...
Numerical and Experimental Investigation of Deep Drawing Process in Square Section of Single-Layer and Two-Layer Sheet
Deep drawing of two-layer sheet is a suitable way to achieve a product with a desired shape and desired properties in sheet metal forming technology. Control of deep drawing parameters such as thinning is the most important challenge in this process. The most difficult part of this challenge is the differences in material properties and geometry of each layer. In this paper, numerical approach has bee...
Unsupervised Semantic-based Aggregation of Deep Convolutional Features
In this paper, we propose a simple but effective semantic-based aggregation (SBA) method. The proposed SBA utilizes the discriminative filters of deep convolutional layers as semantic detectors. Moreover, we propose an effective unsupervised strategy to select some semantic detectors to generate the “probabilistic proposals”, which highlight certain discriminative patterns of objects and suppre...
Learning a Robust Representation via a Deep Network on Symmetric Positive Definite Manifolds
Recent studies have shown that aggregating convolutional features of a pre-trained Convolutional Neural Network (CNN) can obtain impressive performance for a variety of visual tasks. The Symmetric Positive Definite (SPD) matrix becomes a powerful tool due to its remarkable ability to learn an appropriate statistical representation to characterize the underlying structure of visual features. In th...
Energy Absorption Analysis and Multi-objective Optimization of Tri-layer Cups Subjected to Quasi-static Axial Compressive Loading
In this paper, the energy absorption features of tri-layer explosive-welded deep-drawn cups subjected to quasi-static axial compressive loading are investigated numerically and experimentally. To produce the cups, tri-layer blanks composed of aluminum and stainless steel alloys were fabricated by an explosive-welding process and formed by a deep drawing setup. The quasi-static tests were carrie...
Journal: CoRR
Volume: abs/1707.06484
Publication date: 2017